Abstract

We develop several analytical lower bounds on the capacity of binary insertion and deletion channels by 
considering independent uniformly distributed (i.u.d.) inputs and computing lower bounds on the mutual information 
between the input and the output sequences. For the deletion channel, we consider two different models: the i.i.d.
deletion-substitution channel and the i.i.d. deletion channel with additive white Gaussian noise (AWGN). These two
models are considered to incorporate the effects of channel noise along with the synchronization errors. For the
insertion channel, we consider Gallager's model, in which each transmitted bit is replaced, with a given insertion
probability, by two random bits chosen uniformly over the four possibilities, independently of any other insertion
events. The general approach taken is similar in all cases; however, the specific computations differ. Furthermore,
the approach yields a useful lower bound on the capacity for a wide range of deletion probabilities for the deletion
channels, while it provides a beneficial bound only for very low insertion probabilities for the insertion model
adopted. We emphasize the importance of these results by noting that 1) our results are the first analytical bounds
on the capacity of deletion-AWGN channels, 2) the results developed are the best available analytical lower bounds
for the deletion-substitution case, and 3) for the Gallager insertion channel model, the new lower bound improves on
the existing results for small insertion probabilities.

I. Introduction

In modeling digital communication systems, we often assume that the transmitter and the receiver are completely 
synchronized; however, achieving a perfect time-alignment between the transmitter and receiver clocks is not possible 
in all communication systems and synchronization errors are unavoidable. A useful model for synchronization 
errors assumes that the number of received bits may be less or more than the number of transmitted bits. In other 
words, insertion/deletion channels may be used as appropriate models for communication channels that suffer from 
synchronization errors. Due to the memory introduced by the synchronization errors, an information theoretic study 
of these channels proves to be very challenging. For instance, even for seemingly simple models such as an i.i.d. 
deletion channel, an exact calculation of the capacity is not possible and only upper/lower bounds (which are often 
loose) are available. 

In this paper, we compute analytical lower bounds on the capacity of the i.i.d. deletion channel with substitution
errors, the i.i.d. deletion channel in the presence of additive white Gaussian noise (AWGN), and the i.i.d. random
insertion channel, by lower bounding the mutual information rate between the transmitted and received sequences for
independent and uniformly distributed (i.u.d.) inputs. We particularly focus on small insertion/deletion
probabilities, with the premise that such small values are more practical from an application point of view. In the
insertion/deletion channel model considered, every bit is independently deleted with probability $p_d$ or replaced
with two randomly chosen bits with probability $p_i$, while neither the transmitter nor the receiver has any
information about the positions of deletions and insertions; undeleted bits are flipped with probability $p_e$, and
bits are received in the correct order. By a deletion-substitution channel we refer to an insertion/deletion channel
with $p_i = 0$; by a deletion-AWGN channel we refer to an insertion/deletion channel with $p_i = p_e = 0$
(deletion-only channel) in which undeleted bits are received in the presence of AWGN. The latter can be modeled as a
combination of a deletion-only channel and an AWGN channel, such that every bit first goes through the deletion-only
channel and then through the AWGN channel. Finally, by a random insertion channel we refer to an insertion/deletion
channel with $p_d = p_e = 0$.



A. Review of Existing Results 

Dobrushin [1] proved under very general conditions that, for a memoryless channel with synchronization errors,
Shannon's theorem on transmission rates applies and the information and transmission capacities are equal. The
proof hinges on showing that information stability holds for insertion/deletion channels and, as a result [1], the
capacity per bit of an i.i.d. insertion/deletion channel can be obtained as
$\lim_{N\to\infty} \max_{P(X)} \frac{1}{N} I(X;Y)$, where X and Y are the transmitted and received sequences,
respectively, and N is the length of the transmitted sequence. On the other hand, there is no single-letter or
finite-letter formulation which may be amenable for the capacity computation, and no results are available providing
the exact value of the limit.

Gallager [3] considered the use of convolutional codes over channels with synchronization errors, and derived an
expression which represents an achievable rate for channels with insertion, deletion and substitution errors (whose
model is specified earlier). The approach is to consider transmission of i.u.d. binary information sequences by
convolutional coding and modulo-2 addition of a pseudo-random binary sequence (which could be considered as
a watermark used for synchronization purposes), and computation of a rate that guarantees successful decoding
by sequential decoding. The achievable rate, or the capacity lower bound, is given by the expression

$$C \ge 1 + p_d \log p_d + p_i \log p_i + p_c \log p_c + p_s \log p_s, \tag{1}$$

where C is the channel capacity, $p_c = (1 - p_d - p_i)(1 - p_e)$ is the probability of correct reception, and
$p_s = (1 - p_d - p_i)p_e$ is the probability that a flipped version of the transmitted bit is received. The
logarithm is taken base 2, resulting in transmission rates in bits/channel use. By substituting $p_i = 0$ in
Eqn. (1), for $p_d < 0.5$, a lower bound on the capacity of the deletion-substitution channel, $C_{ds}$, can be
obtained as

$$C_{ds} \ge 1 - H_b(p_d) - (1 - p_d) H_b(p_e), \tag{2}$$

where $H_b(p_d) = -p_d \log(p_d) - (1 - p_d) \log(1 - p_d)$ is the binary entropy function. It is interesting to note
that for $p_d = p_e = 0$ ($p_i = p_e = 0$) and $p_i < 0.5$ ($p_d < 0.5$), the lower bound on the capacity of the
random insertion channel (deletion-only channel) with insertion (deletion) probability $p_i$ ($p_d$) is equal to the
capacity of a binary symmetric channel with a substitution error probability of $p_i$ ($p_d$).
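As a small numerical illustration of Eqns. (1) and (2), the following sketch evaluates Gallager's bound and checks that setting $p_i = 0$ reproduces the deletion-substitution form. The function names are illustrative choices, and logarithms are taken base 2 as stated above.

```python
import math


def plog(p: float) -> float:
    """Return p * log2(p), using the convention 0 * log2(0) = 0."""
    return p * math.log2(p) if p > 0 else 0.0


def binary_entropy(p: float) -> float:
    """Binary entropy function H_b(p) in bits."""
    return -plog(p) - plog(1.0 - p)


def gallager_bound(p_d: float, p_i: float, p_e: float) -> float:
    """Right-hand side of Eqn. (1)."""
    p_c = (1.0 - p_d - p_i) * (1.0 - p_e)  # probability of correct reception
    p_s = (1.0 - p_d - p_i) * p_e          # probability of a flipped bit
    return 1.0 + plog(p_d) + plog(p_i) + plog(p_c) + plog(p_s)


def deletion_substitution_bound(p_d: float, p_e: float) -> float:
    """Right-hand side of Eqn. (2)."""
    return 1.0 - binary_entropy(p_d) - (1.0 - p_d) * binary_entropy(p_e)


if __name__ == "__main__":
    for p_d, p_e in [(0.01, 0.0), (0.05, 0.02), (0.1, 0.05)]:
        print(p_d, p_e,
              gallager_bound(p_d, 0.0, p_e),
              deletion_substitution_bound(p_d, p_e))
```

With $p_i = p_e = 0$ the same functions give $1 - H_b(p_d)$, i.e., the capacity of a binary symmetric channel with crossover probability $p_d$.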

In [4], [5], the authors argue that, since the deletion channel has memory, optimal codebooks for use over deletion
channels should have memory. Therefore, in [4], [5], [6], [7], achievable rates are computed by using a random
codebook of rate R with $2^{nR}$ codewords of length n, where each codeword is generated independently according
to a symmetric first-order Markov process. The generated codebook is then used for transmission over the i.i.d.
deletion channel. At the receiver, different decoding algorithms are proposed; e.g., in [4], if the number of
codewords in the codebook that contain the received sequence as a subsequence is exactly one, the transmission is
successful, and otherwise an error is declared. The proposed decoding algorithms result in an upper bound on the
incorrect decoding probability. Finally, the maximum value of R that results in successful decoding as
$n \to \infty$ is an achievable rate, hence a lower bound on the transmission capacity of the deletion channel. The
lower bound (1), for $p_i = p_e = 0$, is also proved in [4] using a different approach from the one taken by
Gallager [3], where the authors computed achievable rates by choosing codewords randomly, independently and
uniformly among all possible codewords of a certain length.

In [8], a lower bound on the capacity of the deletion channel is directly obtained by lower bounding the information
capacity $\lim_{N\to\infty} \frac{1}{N} \max_{P(X)} I(X;Y)$. Here, input sequences are considered as alternating
blocks of zeros and ones (runs), where the lengths of the runs L are i.i.d. random variables following a particular
distribution over positive integers with a finite expectation and finite entropy ($E(L), H(L) < \infty$, where
$E(\cdot)$ and $H(\cdot)$ denote expectation and entropy, respectively).

There are also a few results on the capacity of the sticky channel in the literature 13 [3 111- In lI21lll^ the authors 
derive lower bounds by using the same approach employed for the deletion channel, whereas in [9], several upper
and lower bounds are obtained by resorting to the Blahut-Arimoto algorithm (BAA) in an appropriate manner.

In [10], [11], Monte Carlo methods are used for computing lower bounds on the capacity of the insertion/deletion
channels based on reduced-state techniques. In [10], the input process is assumed to be a stationary Markov
process and lower bounds on the capacity of the deletion and insertion channels are obtained via simulations,
based on first-order and second-order Markov processes as inputs. In [11], information rates for i.u.d. input
sequences are computed for several channel models using a similar Monte Carlo approach where, in addition to the
insertions/deletions, the effects of intersymbol interference (ISI) and AWGN are also investigated.

There are several papers deriving upper bounds on the capacity of the insertion/deletion channels as well. Fertonani 
and Duman in [12] present several novel upper bounds on the capacity of the i.i.d. deletion channel by providing 
the decoder (and possibly the encoder) with some genie-aided information about the deletion process resulting in 
auxiliary channels whose capacities are certainly upper bounds on the capacity of the i.i.d. deletion channel. By 
providing the decoder with appropriate side information, a memoryless channel is obtained in such a way that BAA 
can be used for evaluating the capacity of the auxiliary channels (or, at least computing a provable upper bound on 
their capacities). They also prove that by subtracting some value from the derived upper bounds, lower bounds on 
the capacity can be derived. The intuition is that the subtracted information is more than extra information added by 
revealing certain aspects of the deletion process. A nontrivial upper bound on the deletion channel capacity is also 
obtained in [13], where a different genie-aided decoder is considered. Furthermore, Fertonani and Duman in [14]
extend their work [12] to compute several upper and lower bounds on the capacity of channels with insertion,
deletion and substitution errors as well.

In two recent papers [15], [16], asymptotic capacity expressions for the binary i.i.d. deletion channel for small
deletion probabilities are developed. In [16], the authors prove that $C_d \le 1 - (1 - O(p_d))H_b(p_d)$, which
clearly shows that for small deletion probabilities, $1 - H_b(p_d)$ is a tight lower bound on the capacity of the
deletion channel. In [15], an expansion of the capacity for small deletion probabilities is computed with several
dominant terms in an explicit form. The interpretation of the main result is parallel to the one in [16].

B. Contributions of the Paper

In this paper, we focus on small insertion/deletion probabilities and derive analytical lower bounds on the capacity
of the insertion/deletion channels by lower bounding the mutual information between i.u.d. input sequences and the
resulting output sequences. As shown in [1], for an insertion/deletion channel, the information and transmission
capacities are equal, which justifies our approach to obtaining an achievable rate.

We note that our idea is somewhat similar to the idea of directly lower bounding the information capacity instead
of lower bounding the transmission capacity, as employed in [8]. However, there are fundamental differences in
the main methodology, as will become apparent later. For instance, our approach provides a procedure that can
easily be employed for many different channel models with synchronization errors; as such, we are able to consider
deletion-substitution, deletion-AWGN and random insertion channels. Other differences include adopting a
finite-length transmission, which is proved to yield a lower bound on the capacity after subtracting an appropriate
term, and a much lower complexity in computing the final expression numerically in many versions of our results.

Finally, we emphasize that the new approach and the obtained results improve on the existing literature in
several different aspects. In particular, the contributions of the paper include

• development of a new approach for deriving achievable information rates for insertion/deletion channels, 

• the first analytical lower bound on the capacity of the deletion-AWGN channel, 

• tighter analytical lower bounds on the capacity of the deletion-substitution channel for all values of deletion 
and substitution probabilities compared to the existing analytical results, 

• tighter analytical lower bounds on the capacity of the random insertion channels for small values of insertion 
probabilities compared to the existing lower bounds, 

• very simple lower bounds on the capacity of several cases of insertion/deletion channels. 

Regarding the final point, we note that by employing $p_e = 0$ in the results on the deletion-substitution channel,
we arrive at lower bounds on the capacity of the deletion-only channel which are in agreement with the asymptotic
results of [15], [16] in the sense of capturing the dominant terms in the capacity expansion. Our results, however,
are provable lower bounds on the capacity, while the existing asymptotic results are not amenable to numerical
calculation (as they contain big-O terms).

C. Notation 

We denote a finite binary sequence of length n with K runs by $(b; n_1, n_2, \ldots, n_K)$, where $b \in \{0, 1\}$
denotes the first run type and $\sum_{k=1}^{K} n_k = n$. For example, the sequence 001111011000 can be represented
as (0;2,4,1,2,3). We use four different ways to denote different sequences: $x(b; n^x; K^x)$ represents every
sequence belonging to the set of sequences of length $n^x$ with $K^x$ runs and with the first run of type b;
$x(b; n^x; K^x; l)$ represents a sequence $x(b; n^x; K^x)$ which has l runs of length one
($l = \sum_{k=1}^{K^x} \delta(n_k - 1)$, where $\delta(\cdot)$ denotes the Kronecker delta function); $x(n^x)$
represents every sequence of length $n^x$; and x represents every possible sequence. The set of all input sequences
is shown by $\mathcal{X}$, and the sets of output sequences of the deletion-only and random insertion channels are
shown by $\mathcal{Y}^d$ and $\mathcal{Y}^i$, respectively. $\mathcal{Y}^d_{-a}$ and $\mathcal{Y}^i_{+c}$ denote the
sets of output sequences resulting from a deletions and c random insertions, respectively, and
$\mathcal{Y}^d(x - a)$ and $\mathcal{Y}^i(x + c)$ denote the sets of output sequences resulting from a deletions
from and c random insertions into, the input sequence x, respectively. We denote the deletion pattern of length d 
in a sequence of length n with K runs by $D(n; K; d) = (d_1, d_2, \ldots, d_K)$, where $d_k$ denotes the number of
deletions in the k-th run and $\sum_{k=1}^{K} d_k = d$. The output resulting from a given deletion pattern
$D(n; K; d) = (d_1, d_2, \ldots, d_K)$ (without any other error) is denoted by
$D(n; K; d) * x(n; K) = (n_1 - d_1, n_2 - d_2, \ldots, n_K - d_K)$. The set $\mathcal{D}^{n,K}(d)$ represents the
set of all deletion patterns of length d of a sequence of length n with K runs.
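The run-length notation and the operation $D * x$ can be sketched in a few lines of code. The helper names below (`to_runs`, `from_runs`, `apply_deletion_pattern`) are hypothetical and only serve to illustrate the notation; sequences are represented as strings of '0' and '1'.

```python
from itertools import groupby


def to_runs(bits: str):
    """Return (b, [n_1, ..., n_K]) for a nonempty binary string."""
    lengths = [len(list(group)) for _, group in groupby(bits)]
    return int(bits[0]), lengths


def from_runs(b: int, lengths) -> str:
    """Rebuild the bit string; the k-th run (0-based) has bit value b XOR (k mod 2).

    Runs of length zero contribute nothing, so neighbouring runs of the
    same bit value merge in the output.
    """
    return "".join(str((b + k) % 2) * n_k for k, n_k in enumerate(lengths))


def apply_deletion_pattern(b: int, lengths, pattern) -> str:
    """Compute D * x = (n_1 - d_1, ..., n_K - d_K) and return it as a bit string."""
    if len(pattern) != len(lengths):
        raise ValueError("pattern must have one entry per run")
    if any(d < 0 or d > n for d, n in zip(pattern, lengths)):
        raise ValueError("cannot delete more bits than a run contains")
    remaining = [n - d for n, d in zip(lengths, pattern)]
    return from_runs(b, remaining)


if __name__ == "__main__":
    print(to_runs("001111011000"))
    print(apply_deletion_pattern(0, [2, 4, 1, 2, 3], (1, 0, 1, 0, 2)))
```

Because fully deleted runs are allowed, two different patterns can map the same input to the same output, which is the case the bounding argument has to handle.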

D. Organization of the Paper 

In Section II, we introduce our general approach for lower bounding the mutual information of the input and
output sequences for insertion/deletion channels. In Section III, we apply the introduced approach to the
deletion-substitution and deletion-AWGN channels and present analytical lower bounds on their capacities, and
compare the resulting expressions with earlier results. In Section IV, we provide lower bounds on the capacity of
the random insertion channels and comment on our results with respect to the existing literature. In Section V, we
compute the lower bounds for a number of insertion/deletion channels, and finally, we provide our conclusions in
Section VI.



II. Main Approach 

We rely on lower bounding the information capacity of memoryless channels with insertion or deletion errors
directly, as justified by [1], where it is shown that, for a memoryless channel with synchronization errors,
Shannon's theorem on transmission rates applies and the information and transmission capacities are equal; thus
every lower bound on the information capacity of an insertion/deletion channel is a lower bound on the
transmission capacity of the channel. Our approach is different from most existing work on finding lower bounds on
the capacity of the insertion/deletion channels, where typically the transmission capacity is lower bounded using a
certain codebook and particular decoding algorithms. The idea we employ is similar to the work in [8], which also
considers the information capacity $\lim_{N\to\infty} \frac{1}{N} \max_{P(X)} I(X;Y)$ and directly lower bounds it
using a particular input distribution to arrive at an achievable rate result.

Our primary focus is on small deletion and insertion probabilities. As also noted in [15], for such probabilities
it is natural to consider a binary i.u.d. input distribution. This is justified by noting that when
$p_d = p_i = 0$, i.e., for a binary symmetric channel, the capacity is achieved with independent and symmetric
binary inputs, and hence we expect that for small deletion/insertion probabilities, binary i.u.d. inputs are not
far from the optimal input distribution.

Our methodology is to consider a finite length transmission of i.u.d. bits over the insertion/deletion channel,
and to compute (more precisely, lower bound) the mutual information between the input and the resulting output
sequences. As proved in [12] for a channel with deletion errors, such a finite length transmission in fact results
in an upper bound on the mutual information supported by the insertion/deletion channels; however, as also shown
in [12], if a suitable term is subtracted from the mutual information, a provable lower bound on the achievable
rate, hence the channel capacity, results. The following theorem provides this result in a slightly generalized form
compared to [12].

Theorem 1. For binary input channels with i.i.d. insertion or deletion errors, for any input distribution and any
$n > 0$, the channel capacity C can be lower bounded by

$$C \ge \frac{1}{n} I(X;Y) - \frac{1}{n} H(T), \tag{3}$$

where $H(T) = -\sum_{j=0}^{n} \binom{n}{j} p^j (1-p)^{n-j} \log\left(\binom{n}{j} p^j (1-p)^{n-j}\right)$, with the
understanding that $p = p_d$ in the deletion channel case and $p = p_i$ in the insertion channel case, and n is the
length of the input sequence X.

Proof: This is a slight generalization of a result in [12], which shows that Eqn. (3) is valid for the i.i.d.
deletion channel. It is easy to see that [12], for any random process $T^N$ and for any input distribution
$P(X^N)$, we have

$$C \ge \lim_{N\to\infty} \frac{1}{N} I(X^N; Y^N, T^N) - \lim_{N\to\infty} \frac{1}{N} H(T^N), \tag{4}$$

where C is the capacity of the channel, N is the length of the input sequence and N = Qn, i.e., the input
bits in both insertion and deletion channels are divided into Q blocks of length n ($X^N = \{X_j\}_{j=1}^{Q}$). We
define the random process $T^N$ in the following manner. For an i.i.d. insertion channel, $T^{N,i}$ is formed as the
sequence $T^{N,i} = \{T_j^i\}_{j=1}^{Q}$, which denotes the number of insertions that occur in each block of length n
transmission. For a deletion channel, $T^{N,d} = \{T_j^d\}_{j=1}^{Q}$ represents the number of deletions occurring in
the transmission of each block. Since insertions (deletions) for different blocks are independent, the random
variables $T_j = T_j^i$ ($T_j^d$), $j = 1, \ldots, Q$, are i.i.d., and transmissions of different blocks are
independent. Therefore, we can rewrite Eqn. (4) as

$$C \ge \frac{1}{n} I(X_j; Y_j) - \frac{1}{n} H(T_j) = \frac{1}{n} I(X;Y) - \frac{1}{n} H(T). \tag{5}$$

Noting that the random variable denoting the number of deletions or insertions as a result of an n-bit transmission
is binomial with parameters n and $p_d$ (or $p_i$), the result follows. ■
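The penalty term $\frac{1}{n}H(T)$ in Eqn. (3) is simply the entropy of a binomial random variable divided by the block length, so it is easy to evaluate. The following sketch (function names are illustrative; logarithms are base 2) shows how quickly the penalty shrinks as n grows.

```python
import math


def binomial_entropy(n: int, p: float) -> float:
    """Entropy in bits of T ~ Binomial(n, p), i.e. H(T) in Eqn. (3).

    Intended for moderate n (a few hundred); very large n would need
    log-domain arithmetic to avoid overflow in the binomial coefficient.
    """
    h = 0.0
    for j in range(n + 1):
        prob = math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
        if prob > 0.0:
            h -= prob * math.log2(prob)
    return h


def penalty(n: int, p: float) -> float:
    """The term H(T)/n that is subtracted from I(X;Y)/n in Eqn. (3)."""
    return binomial_entropy(n, p) / n


if __name__ == "__main__":
    for n in (10, 50, 100, 500):
        print(n, penalty(n, 0.05))
```

Since H(T) grows only logarithmically in n while the mutual information grows linearly, a larger block length reduces the per-bit penalty, at the cost of a harder mutual information computation.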
Several comments on the specific calculations involved are in order. Theorem 1 shows that for any input distribution
and any transmission length, Eqn. (3) results in a lower bound on the capacity of the channel with deletion or
insertion errors. Therefore, employing any lower bound on the mutual information rate $\frac{1}{n} I(X;Y)$ in
Eqn. (3) also results in a lower bound on the capacity of the insertion/deletion channel. Due to the fact that
obtaining the exact value of the mutual information rate for any n is infeasible, we first derive a lower bound on
the mutual information rate for i.u.d. input sequences and then employ it in Eqn. (3). Based on the formulation of
the mutual information, we have

$$I(X;Y) = H(Y) - H(Y|X), \tag{6}$$

thus by calculating the exact value of the output entropy or lower bounding it and obtaining the exact value of 
the conditional output entropy or upper bounding it, the mutual information is lower bounded. For the models 
adopted in this paper, we are able to obtain the exact value of the output sequence probability distribution when 
i.u.d. input sequences are used, hence the exact value of the output entropy (the differential output entropy for the 
deletion-AWGN channel) is available. 

In deriving the conditional output entropies (the conditional differential entropy of the output sequence for the 
deletion-AWGN channel), we cannot obtain the exact probability of all the possible output sequences conditioned 
on a given input sequence. For deletion channels, we compute the probability of all possible deletion patterns for a 
given input sequence, and treat the resulting sequences as if they are all distinct to find a provable upper bound on 
the conditional entropy term. Clearly, we are losing some tightness, as different deletion patterns may result in the 
same sequence at the channel output. For the random insertion channel, we calculate the conditional probability of 
the output sequences resulting from at most one insertion, and derive an upper bound on the part of the conditional 
output entropy expression that results from the output sequences with multiple insertions. 
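To make the bounding step for the conditional entropy concrete, the following sketch compares, for one short input and a deletion-only channel, the exact conditional entropy $H(Y|x)$ (found by brute-force enumeration of all deletion masks) with the entropy obtained by treating every run-wise deletion pattern as if it produced a distinct output. The function names and the test inputs are illustrative choices, not part of the original derivation.

```python
import math
from collections import defaultdict
from itertools import groupby, product


def entropy(probs) -> float:
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def exact_conditional_entropy(x: str, p_d: float) -> float:
    """H(Y | X = x): deletion masks giving the same output are merged."""
    n = len(x)
    out = defaultdict(float)
    for mask in product((0, 1), repeat=n):  # 1 marks a deleted bit
        d = sum(mask)
        y = "".join(bit for bit, m in zip(x, mask) if not m)
        out[y] += p_d**d * (1.0 - p_d) ** (n - d)
    return entropy(out.values())


def pattern_entropy(x: str, p_d: float) -> float:
    """Entropy of the run-wise deletion pattern (d_1, ..., d_K).

    Outputs of different patterns are treated as distinct, which gives an
    upper bound on H(Y | X = x).
    """
    n = len(x)
    runs = [len(list(group)) for _, group in groupby(x)]
    probs = []
    for pattern in product(*(range(r + 1) for r in runs)):
        d = sum(pattern)
        ways = math.prod(math.comb(r, d_k) for r, d_k in zip(runs, pattern))
        probs.append(ways * p_d**d * (1.0 - p_d) ** (n - d))
    return entropy(probs)


if __name__ == "__main__":
    for x in ("0011", "1101100011"):
        print(x, exact_conditional_entropy(x, 0.3), pattern_entropy(x, 0.3))
```

For an input with only two runs every pattern gives a different output and the two values coincide; for inputs with three or more runs some patterns collide, and the pattern entropy is strictly larger. This gap is the tightness that is given up in exchange for a closed-form expression.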

III. Lower Bounds on the Capacity of Noisy Deletion Channels 

As mentioned earlier, we consider two different variations of the binary deletion channel: i.i.d. deletion and 
substitution channel (deletion-substitution channel), and i.i.d. deletion channel in the presence of AWGN (deletion- 
AWGN channel). The results utilize the idea and approach of the previous section. We first give the results for the 
deletion-substitution channel, then for the deletion-AWGN channel. We note that the presented lower bounds can
also be employed for the deletion-only channel if $p_e = 0$ (or $\sigma^2 = 0$ for the deletion-AWGN channel).

A. Deletion-Substitution Channel 

In this section, we consider a binary deletion channel with substitution errors in which each bit is independently
deleted with probability $p_d$, and transmitted bits are independently flipped with probability $p_e$. The receiver
and the transmitter do not have any information about the positions of deletions or the transmission errors. As
shown in Fig. 1, this channel can be considered as a cascade of an i.i.d. deletion channel with a deletion
probability $p_d$ and output sequence Y, and a binary symmetric channel (BSC) with a cross-over error probability
$p_e$ and output sequence Y'. For such a channel model, the following lemma gives a lower bound on the capacity.

Fig. 1. Deletion-substitution channel.

Lemma 1. For any $n > 0$, the capacity of the i.i.d. deletion-substitution channel $C_{ds}$, with a substitution
probability $p_e$ and a deletion probability $p_d$, is lower bounded by

$$C_{ds} \ge 1 - p_d - H_b(p_d) + \frac{1}{n} \sum_{j=1}^{n} W_j(n) \binom{n}{j} p_d^j (1-p_d)^{n-j} - (1 - p_d) H_b(p_e), \tag{7}$$

where $H_b(p_d) = -p_d \log(p_d) - (1 - p_d) \log(1 - p_d)$ and

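$$W_j(n) = \frac{1}{\binom{n}{j}} \left[ \sum_{l=1}^{n-1} 2^{-l-1} (n-l+3) \sum_{j'=1}^{j} \binom{l}{j'} \binom{n-l}{j-j'} \log\binom{l}{j'} + 2^{1-n} \binom{n}{j} \log\binom{n}{j} \right]. \tag{8}$$

□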

Before proving the lemma, we would like to emphasize that the only existing analytical lower bound on
the capacity of deletion-substitution channels is derived in [3] (Eqn. (2)). In comparing the lower bound in
Eqn. (2) with the lower bound in Eqn. (7), we observe that the new lower bound improves the previous one
by $\frac{1}{n} \sum_{j=1}^{n} W_j(n) \binom{n}{j} p_d^j (1-p_d)^{n-j} - p_d$, which is guaranteed to be positive.

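A simplified form of the lower bound for small values of the deletion probability can also be presented. By invoking
the inequalities $(1-p)^m \ge 1 - mp + \binom{m}{2} p^2 - \binom{m}{3} p^3$ and $(1-p)^m \ge 1 - mp$, and ignoring
some positive terms ($\frac{1}{n} W_j(n) \binom{n}{j} p_d^j (1-p_d)^{n-j}$ for $j \ge 3$), we can write

$$\begin{aligned} C_d \ge{} & 1 - H_b(p_d) + p_d \big(W_1(n) - 1\big) + p_d^2\, \frac{n-1}{2} \big(W_2(n) - 2W_1(n)\big) \\ & + p_d^3\, \frac{(n-1)(n-2)}{2} \big(W_1(n) - W_2(n)\big) - p_d^4 \binom{n-1}{3} W_1(n). \end{aligned} \tag{9}$$

By utilizing $p_e = 0$ in Eqn. (7), we can obtain a lower bound on the capacity of the deletion-only channel as
given in the following corollary.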

Corollary 1. For any $n > 0$, the capacity of an i.i.d. deletion channel $C_d$, with a deletion probability of
$p_d$, is lower bounded by

$$C_d \ge 1 - p_d - H_b(p_d) + \frac{1}{n} \sum_{j=1}^{n} W_j(n) \binom{n}{j} p_d^j (1-p_d)^{n-j}. \tag{10}$$



We also would like to make a few comments on the result of Corollary 1. First of all, the lower bound (10) is
tighter than the one proved in [3] (Eqn. (1) with $p_i = p_e = 0$), which is the simplest analytical lower bound on
the capacity of the deletion channel. The amount of improvement in (10) over the one in (1) is
$\frac{1}{n} \sum_{j=1}^{n} W_j(n) \binom{n}{j} p_d^j (1-p_d)^{n-j} - p_d$, which is guaranteed to be positive.
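The bound of Lemma 1 and Corollary 1 can be evaluated numerically with a short script. The sketch below is written under an explicit assumption: the expression used for $W_j(n)$ is the one obtained by combining the run-averaging step of the proof of Proposition 2 with the run statistics $P_R(l,n) = 2^{-l-1}(n-l+3)$ for $l < n$ and $P_R(n,n) = 2^{1-n}$, namely $W_j(n) = \binom{n}{j}^{-1} \sum_{l=1}^{n} P_R(l,n) \sum_{j'=1}^{j} \binom{l}{j'}\binom{n-l}{j-j'} \log\binom{l}{j'}$. All function names are illustrative, and logarithms are base 2.

```python
import math


def h_b(p: float) -> float:
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def run_weight(l: int, n: int) -> float:
    """Assumed P_R(l, n): 2^(1-n) for l = n, else 2^(-l-1) * (n - l + 3)."""
    if l == n:
        return 2.0 ** (1 - n)
    return 2.0 ** (-l - 1) * (n - l + 3)


def w(j: int, n: int) -> float:
    """W_j(n) in the assumed form described in the lead-in."""
    total = 0.0
    for l in range(1, n + 1):
        inner = sum(
            math.comb(l, jp) * math.comb(n - l, j - jp) * math.log2(math.comb(l, jp))
            for jp in range(1, min(j, l) + 1)
        )
        total += run_weight(l, n) * inner
    return total / math.comb(n, j)


def lemma1_bound(n: int, p_d: float, p_e: float = 0.0) -> float:
    """Lower bound of Eqn. (7); with p_e = 0 this is Eqn. (10)."""
    s = sum(
        w(j, n) * math.comb(n, j) * p_d**j * (1.0 - p_d) ** (n - j)
        for j in range(1, n + 1)
    )
    return 1.0 - p_d - h_b(p_d) + s / n - (1.0 - p_d) * h_b(p_e)


def gallager_ds_bound(p_d: float, p_e: float = 0.0) -> float:
    """Lower bound of Eqn. (2)."""
    return 1.0 - h_b(p_d) - (1.0 - p_d) * h_b(p_e)


if __name__ == "__main__":
    n = 30
    for p_d in (0.01, 0.05, 0.1):
        print(p_d, gallager_ds_bound(p_d), lemma1_bound(n, p_d))
```

The block length n = 30 keeps the triple sum cheap; the improvement over Eqn. (2) does not depend on $p_e$, since both bounds subtract the same term $(1-p_d)H_b(p_e)$.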
In [15], it is shown that

$$C_d = 1 + p_d \log(p_d) - A_1 p_d + O(p_d^{1.4}), \tag{11}$$

where $A_1 = \log(2e) - \sum_{l=1}^{\infty} 2^{-l-1}\, l \log(l)$, and $O(\cdot)$ represents the standard Landau
(big O) notation. A similar result is provided in [16], that is

$$C_d \le 1 - (1 - O(p_d)) H_b(p_d), \tag{12}$$



which shows that $1 - H_b(p_d)$ is a tight lower bound for small deletion probabilities. If we consider the new
capacity lower bound in (10), and represent $(1 - p_d) \log(1 - p_d)$ by its Taylor series expansion, we can readily
write

$$C_d \ge 1 + p_d \log(p_d) - \big(\log(2e) - W_1(n)\big) p_d + p_d^2 f(n, p_d), \tag{13}$$

where $f(n, p_d)$ is a polynomial function. On the other hand, for $W_1(n)$, if we let n go to infinity, we have

$$\lim_{n\to\infty} W_1(n) = \lim_{n\to\infty} \left[ \frac{1}{n} \sum_{l=1}^{n-1} 2^{-l-1} (n-l+3)\, l \log(l) + 2^{1-n} \log(n) \right] = \sum_{l=1}^{\infty} 2^{-l-1}\, l \log(l). \tag{14}$$



Therefore, we observe that the lower bound (10) captures the first order term of the capacity expansion (11). This
is an important result, as the capacity expansions in [15], [16] are asymptotic and do not lend themselves to a
numerical calculation of the transmission rates for any non-zero value of the deletion probability.
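The convergence of $W_1(n)$ to the series appearing in $A_1$ can be checked numerically. The sketch below assumes base-2 logarithms and takes $W_1(n) = \frac{1}{n}\sum_{l=1}^{n-1} 2^{-l-1}(n-l+3)\, l \log(l) + 2^{1-n}\log(n)$, where the last term is the contribution of the single run of length n; the function names are illustrative.

```python
import math


def w1(n: int) -> float:
    """W_1(n) for finite n (assumed form, logarithms base 2)."""
    s = sum(
        math.ldexp(1.0, -l - 1) * (n - l + 3) * l * math.log2(l)
        for l in range(1, n)
    )
    # ldexp(1.0, k) computes 2**k and underflows quietly to 0.0 for large negative k
    return s / n + math.ldexp(1.0, 1 - n) * math.log2(n)


def w1_limit(terms: int = 200) -> float:
    """Truncated series sum_{l >= 1} 2^(-l-1) * l * log2(l)."""
    return sum(math.ldexp(1.0, -l - 1) * l * math.log2(l) for l in range(1, terms + 1))


def first_order_coefficient(n: int) -> float:
    """Coefficient log(2e) - W_1(n) multiplying p_d in Eqn. (13)."""
    return math.log2(2.0 * math.e) - w1(n)


if __name__ == "__main__":
    print("series limit:", w1_limit())
    print("A_1 estimate:", math.log2(2.0 * math.e) - w1_limit())
    for n in (10, 50, 200, 2000):
        print(n, w1(n), first_order_coefficient(n))
```

The finite-n coefficient approaches $A_1$ from above as n grows, which is consistent with Eqn. (13) being a lower bound whose first-order term matches the expansion (11) in the limit.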

We need the following two propositions in the proof of Lemma 1. In Proposition 1, we obtain the exact value
of the output entropy in the deletion-substitution channel with i.u.d. input sequences, while Proposition 2 gives an
upper bound on the conditional output entropy with i.u.d. bits transmitted through the deletion-substitution channel.

Proposition 1. For an i.i.d. deletion-substitution channel with i.u.d. input sequences of length n, we have

$$H(Y') = n(1 - p_d) + H(T), \tag{15}$$

where Y' denotes the output sequence of the deletion-substitution channel and H(T) is as defined in Eqn. (3).

Proof: By using the facts that all the elements of the set $\mathcal{Y}^d_{-j}$, which are inputs into the BSC, are
identically distributed, and that a fixed length i.u.d. input sequence into a BSC results in i.u.d. output sequences,
all elements of the set $\mathcal{Y}'_{-j}$ are also identically distributed. Hence,

$$P(y'(n-j)) = \frac{1}{2^{n-j}} \binom{n}{j} p_d^j (1-p_d)^{n-j}, \tag{16}$$

where $\binom{n}{j} p_d^j (1-p_d)^{n-j}$ is the probability of exactly j deletions occurring in n uses of the
channel. Therefore, we obtain

$$H(Y') = \sum_{y'} -P(y') \log(P(y')) = n(1 - p_d) + H(T). \tag{17}$$
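Proposition 1 can be verified by brute force for a small block length. The following sketch enumerates all inputs, deletion masks and substitution patterns of a deletion-substitution channel, accumulates the exact output distribution, and compares its entropy with $n(1-p_d) + H(T)$. The function names are illustrative, and the enumeration is only practical for small n.

```python
import math
from collections import defaultdict
from itertools import product


def binomial_entropy(n: int, p: float) -> float:
    """H(T) for T ~ Binomial(n, p), in bits."""
    h = 0.0
    for j in range(n + 1):
        prob = math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
        if prob > 0.0:
            h -= prob * math.log2(prob)
    return h


def output_entropy(n: int, p_d: float, p_e: float) -> float:
    """Exact H(Y') for i.u.d. inputs, by enumerating every channel realization."""
    dist = defaultdict(float)
    for x in product((0, 1), repeat=n):
        for mask in product((0, 1), repeat=n):  # 1 marks a deleted bit
            kept = [bit for bit, m in zip(x, mask) if not m]
            d = n - len(kept)
            p_del = p_d**d * (1.0 - p_d) ** (n - d)
            for flips in product((0, 1), repeat=len(kept)):
                e = sum(flips)
                p_flip = p_e**e * (1.0 - p_e) ** (len(kept) - e)
                y = tuple(bit ^ f for bit, f in zip(kept, flips))
                dist[y] += 2.0 ** (-n) * p_del * p_flip
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


if __name__ == "__main__":
    n, p_d, p_e = 4, 0.2, 0.1
    print(output_entropy(n, p_d, p_e), n * (1.0 - p_d) + binomial_entropy(n, p_d))
```

The two printed values agree up to floating-point error, reflecting that every output of a given length is equally likely when the input is i.u.d.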



Proposition 2. For a deletion-substitution channel with i.u.d. input sequences, the entropy of the output sequence
Y' conditioned on the input X of length n bits is upper bounded by

$$H(Y'|X) \le n H_b(p_d) - \sum_{j=1}^{n} W_j(n) \binom{n}{j} p_d^j (1-p_d)^{n-j} + n(1 - p_d) H_b(p_e), \tag{18}$$

where $W_j(n)$ is given in Eqn. (8).

Proof: To obtain the conditional output entropy, we need to compute the probability of all possible output
sequences resulting from every possible input sequence x, i.e., $P(Y'|x)$. For a given
$x = (b; n_1, n_2, \ldots, n_K)$ and for a specific deletion pattern $D(n; K; j) = (j_1, \ldots, j_K)$, in which
$j_k$ denotes the number of deletions in the k-th run, we can write

$$P\big(D(n; K; j) \,\big|\, x(n; K)\big) = \prod_{k=1}^{K} \binom{n_k}{j_k}\, p_d^{j} (1-p_d)^{n-j}. \tag{19}$$

Furthermore, for every $D(n; K; d)$, we can write

$$P\big(y' \,\big|\, D(n; K; d) * x(n; K)\big) = \begin{cases} p_e^{e} (1 - p_e)^{n-d-e} & \text{if } |y'| = n - d \\ 0 & \text{otherwise,} \end{cases} \tag{20}$$

where $e = d_H\big(y'; D(n; K; d) * x(n; K)\big)$, and $d_H(a; b)$ is the Hamming distance between two sequences a
and b. On the other hand, for every output sequence of length n - d, conditioned on a given input x(n; K), we have

$$P\big(y'(n-d) \,\big|\, x(n; K)\big) = \sum_{D \in \mathcal{D}^{n,K}(d)} P\big(y'(n-d) \,\big|\, D * x(n; K)\big)\, P\big(D \,\big|\, x(n; K)\big). \tag{21}$$

However, there is a difficulty, as two different deletion patterns, $D(n; K; j) = (j_1, \ldots, j_K)$ and
$D'(n; K; j) = (j'_1, \ldots, j'_K)$, under the same substitution error pattern (i.e., the substitution errors occur
at the same positions on $D(n; K; j) * x(n; K)$ and $D'(n; K; j) * x(n; K)$), may convert a given input sequence
x(n; K) into the same output sequence, i.e., $D(n; K; j) * x(n; K) = D'(n; K; j) * x(n; K)$. This occurs when
successive runs are completely deleted. For example, in transmitting (1; 2, 1, 2, 3, 2) = 1101100011, if the second,
third and fourth runs are completely deleted, then deleting one bit from the first run,
(1, 1, 2, 3, 0) * (1; 2, 1, 2, 3, 2) = (1; 1, 0, 0, 0, 2) = 111, or from the last run,
(0, 1, 2, 3, 1) * (1; 2, 1, 2, 3, 2) = (1; 2, 0, 0, 0, 1) = 111, yields the same output sequence.
This difficulty can be addressed using

$$-\Big(\sum_t p_t\Big) \log\Big(\sum_t p_t\Big) \le -\sum_t p_t \log(p_t), \tag{22}$$

which is trivially valid for any set of probabilities $\{p_1, \ldots, p_t, \ldots\}$. Therefore, we can write

$$\begin{aligned} -P(y'|x) \log\big(P(y'|x)\big) &= -\sum_{D \in \mathcal{D}^{n,K}(d)} P(y'|D * x) P(D|x) \log\Big( \sum_{D' \in \mathcal{D}^{n,K}(d)} P(y'|D' * x) P(D'|x) \Big) \\ &\le -\sum_{D \in \mathcal{D}^{n,K}(d)} P(y'|D * x) P(D|x) \log\big( P(y'|D * x) P(D|x) \big). \end{aligned} \tag{23}$$

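Hence, for a specific $x(b; n; K^x) = (b; n_1^x, \ldots, n_{K^x}^x)$, we obtain (for more details see Appendix A)

$$H\big(Y' \,\big|\, x(b; n; K^x)\big) \le n H_b(p_d) + n(1 - p_d) H_b(p_e) - \sum_{j=0}^{n} p_d^j (1-p_d)^{n-j} \sum_{k=1}^{K^x} \sum_{j_k=0}^{j} \binom{n_k^x}{j_k} \binom{n - n_k^x}{j - j_k} \log\binom{n_k^x}{j_k}. \tag{24}$$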



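Therefore, by considering i.u.d. input sequences, we have

$$\begin{aligned} H(Y'|X) &= \sum_{x \in \mathcal{X}} \frac{1}{2^n} H(Y'|x) \\ &\le n H_b(p_d) + n(1 - p_d) H_b(p_e) - \sum_{j=0}^{n} p_d^j (1-p_d)^{n-j} \sum_{x \in \mathcal{X}} \frac{1}{2^n} \sum_{k=1}^{K^x} \sum_{j_k=0}^{j} \binom{n_k^x}{j_k} \binom{n - n_k^x}{j - j_k} \log\binom{n_k^x}{j_k}. \end{aligned} \tag{25}$$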



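On the other hand, we can write

$$\sum_{x \in \mathcal{X}} \frac{1}{2^n} \sum_{k=1}^{K^x} \sum_{j_k=0}^{j} \binom{n_k^x}{j_k} \binom{n - n_k^x}{j - j_k} \log\binom{n_k^x}{j_k} = \sum_{l=1}^{n} P_R(l, n) \sum_{j'=0}^{j} \binom{l}{j'} \binom{n-l}{j-j'} \log\binom{l}{j'}, \tag{26}$$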



where $P_R(l, n)$ denotes the probability of having a run of length l in an input sequence of length n. It is obvious
that $P_R(n, n) = \frac{1}{2^{n-1}}$. Due to the fact that, for $1 \le l \le n - 1$, there are
$\binom{n-l-1}{K-2}$ possibilities to have a run of length l in a sequence with K runs, we can write

$$P_R(l, n) = \frac{2}{2^n} \sum_{K=2}^{n-l+1} \binom{n-l-1}{K-2} K = 2^{-l-1} (n - l + 3). \tag{27}$$

Finally, by substituting Eqns. (26) and (27) in Eqn. (25), Eqn. (18) results, completing the proof. ■

Fig. 2. Deletion-AWGN channel.
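The closed form $2^{-l-1}(n-l+3)$ for the run statistics can be confirmed by exhaustive enumeration for small n. The sketch below interprets $P_R(l, n)$ as the average number of runs of length l per i.u.d. sequence of length n, which is the quantity the counting argument (a factor K for the position of the run) computes; this interpretation and the function names are assumptions made for the illustration.

```python
from itertools import groupby, product


def average_runs_of_length(l: int, n: int) -> float:
    """Average number of runs of length l over all 2^n binary sequences."""
    count = 0
    for x in product("01", repeat=n):
        count += sum(1 for _, group in groupby(x) if len(list(group)) == l)
    return count / 2**n


def closed_form(l: int, n: int) -> float:
    """2^(1-n) for l = n, and 2^(-l-1) * (n - l + 3) for 1 <= l <= n - 1."""
    if l == n:
        return 2.0 ** (1 - n)
    return 2.0 ** (-l - 1) * (n - l + 3)


if __name__ == "__main__":
    n = 8
    for l in range(1, n + 1):
        print(l, average_runs_of_length(l, n), closed_form(l, n))
```

Summing $l \cdot P_R(l, n)$ over l returns n, since every bit belongs to exactly one run; this gives a quick sanity check on the closed form.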
We can now complete the proof of the main lemma of the section. 

Proof of Lemma 1: In Theorem 1, we showed that for any input distribution and any transmission length, Eqn. (3)
results in a lower bound on the capacity of the channel with i.i.d. deletion errors. On the other hand, any lower
bound on the information rate can also be used to derive a lower bound on the capacity. Due to the definition of the
mutual information, Eqn. (6), by obtaining the exact value of the output entropy (Proposition 1) and upper bounding
the conditional output entropy (Proposition 2), the mutual information is lower bounded. Finally, by substituting
Eqns. (15) and (18) into Eqn. (3), Lemma 1 is proved. □



B. Deletion-AWGN Channel 

In this section, a binary deletion channel in the presence of AWGN is considered, where the bits are transmitted
using binary phase shift keying (BPSK) and the received signal contains AWGN in addition to the deletion errors.
As illustrated in Fig. 2, this channel can be considered as a cascade of two independent channels, where the first
channel is an i.i.d. deletion channel and the second one is a binary input AWGN (BI-AWGN) channel. We use
$\bar{X}$ to denote the input sequence to the first channel, which is a BPSK modulated version of the binary input
sequence X, i.e., $\bar{x}_i = 1 - 2x_i$, and $\bar{Y}$ to denote the output sequence of the first channel, which is
the input to the second one. $\tilde{Y}$ is the output sequence of the second channel, that is, the noisy version of
$\bar{Y}$, i.e.,

$$\tilde{y}_i = \bar{y}_i + z_i, \tag{28}$$

where the $z_i$'s are i.i.d. Gaussian random variables with zero mean and a variance of $\sigma^2$, and
$\tilde{y}_i$ and $\bar{y}_i$ are the i-th received and transmitted bits of the second channel, respectively.
Therefore, for the probability density function of the i-th channel output, we have

$$\begin{aligned} f_{\tilde{y}_i}(r_i) &= f_{\tilde{y}_i}(r_i \mid \bar{y}_i = 1) P(\bar{y}_i = 1) + f_{\tilde{y}_i}(r_i \mid \bar{y}_i = -1) P(\bar{y}_i = -1) \\ &= \frac{1}{\sqrt{2\pi}\,\sigma} \left[ P(\bar{y}_i = 1)\, e^{-\frac{(r_i - 1)^2}{2\sigma^2}} + P(\bar{y}_i = -1)\, e^{-\frac{(r_i + 1)^2}{2\sigma^2}} \right]. \end{aligned} \tag{29}$$
In the following lemma, an achievable rate is provided over this channel.

Lemma 2. For any $n > 0$, the capacity of the deletion-AWGN channel with a deletion probability of $p_d$ and a noise variance of $\sigma^2$ is lower bounded by

$$C_{dAWGN} \ge 1 - p_d - H_b(p_d) + \frac{1}{n}\sum_{j=1}^{n} W_j(n)\binom{n}{j} p_d^j (1-p_d)^{n-j} - (1-p_d)\,E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right], \tag{30}$$

where $W_j(n)$ is as given in Eqn. (8), $E[\cdot]$ is statistical expectation, and $z \sim \mathcal{N}(0, \sigma^2)$. □



Before giving the proof of the above lemma, we provide several comments about the result. First, the lower bound in Eqn. (30) is the only analytical lower bound on the capacity of the deletion-AWGN channel. In the current literature, there are only simulation-based lower bounds, e.g., [11], which employs Monte-Carlo simulation techniques. Furthermore, the procedure employed in [11] is only useful for deriving lower bounds for small values of the deletion probability, e.g., $p_d < 0.1$, while the lower bound in Eqn. (30) is useful for a much wider range.

For $p_d = 0$, the lower bound in Eqn. (30) is equal to $1 - E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right]$, which is the capacity of the BI-AWGN channel [?, p. 362]. Finally, we note that the term in Eqn. (30) which contains $E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right]$ can be easily computed by numerical integration with an arbitrary accuracy (it involves only a one-dimensional integral).
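As a rough illustration of that one-dimensional integral, the sketch below evaluates $E[\log_2(1 + e^{-2(z+1)/\sigma^2})]$ with a plain trapezoidal rule and uses it to obtain the $p_d = 0$ value of the bound, i.e., the BI-AWGN capacity. The function names, the integration window of $\pm 10\sigma$, and the step count are assumptions made for this example, and logarithms are taken base 2.

```python
import math


def softplus_bits(t: float) -> float:
    """log2(1 + e**t), evaluated without overflow for large |t|."""
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / math.log(2.0)


def awgn_term(sigma: float, steps: int = 20000, span: float = 10.0) -> float:
    """Trapezoidal estimate of E[log2(1 + exp(-2(z+1)/sigma^2))], z ~ N(0, sigma^2)."""
    lo, hi = -span * sigma, span * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        pdf = math.exp(-z * z / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * pdf * softplus_bits(-2.0 * (z + 1.0) / sigma ** 2)
    return total * h


def bi_awgn_capacity(sigma: float) -> float:
    """Value of the bound at p_d = 0: 1 - E[log2(1 + exp(-2(z+1)/sigma^2))]."""
    return 1.0 - awgn_term(sigma)


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, bi_awgn_capacity(sigma))
```

A Gauss-Hermite rule would converge faster; the trapezoidal rule is used here only to keep the example dependency-free.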

We need the following two propositions in the proof of Lemma 2. In the following proposition, the exact value of the differential output entropy in the deletion-AWGN channel with i.u.d. input bits is calculated.

Proposition 3. For an i.i.d. deletion-AWGN channel with i.u.d. input sequences of length $n$, we have

$$h(\tilde{Y}) = n(1-p_d)\left(\log\left(2\sigma\sqrt{2\pi e}\right) - E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right]\right) + H(T), \tag{31}$$

where $h(\cdot)$ denotes the differential entropy function, $\tilde{Y}$ denotes the output of the deletion-AWGN channel, $z \sim \mathcal{N}(0, \sigma^2)$,
and H{T) is as defined in Eqn. Q. 

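Proof: For the differential entropy of the output sequence, we can write

$$h(\tilde{Y}) = h(\tilde{Y}) + H(T|\tilde{Y}) = h(\tilde{Y}, T) = h(\tilde{Y}|T) + H(T), \tag{32}$$

where the first equality results from the fact that, by knowing the received sequence, the number of deletions is known and $T$ is determined, i.e., $H(T|\tilde{Y}) = 0$, and the last equality is obtained by using a different expansion of $h(\tilde{Y}, T)$. On the other hand, we can write

$$h(\tilde{Y}|T) = \sum_{d=0}^{n} h(\tilde{Y}|T=d)P(T=d) = \sum_{d=0}^{n} h(\tilde{Y}|T=d)\binom{n}{d} p_d^d (1-p_d)^{n-d}. \tag{33}$$

Because all sequences $\bar{y}$ of length $n-d$ at the output of the deletion channel are equiprobable, we have $P(\bar{y}(n-d)) = P(\bar{y}, T=d) = \frac{1}{2^{n-d}}\binom{n}{d} p_d^d (1-p_d)^{n-d}$. Therefore, we can write

$$P(\bar{y}|T=d) = \frac{P(\bar{y}, T=d)}{P(T=d)} = \frac{1}{2^{n-d}}, \tag{34}$$

and as a result $P(\bar{y}_i^d = 1|T=d) = P(\bar{y}_i^d = -1|T=d) = \frac{1}{2}$ (for $1 \le i \le n-d$). By employing this result in Eqn. (29), we have

$$f_{\tilde{y}_i^d}(\eta) = \frac{1}{2\sqrt{2\pi}\sigma}\left[e^{-\frac{(\eta-1)^2}{2\sigma^2}} + e^{-\frac{(\eta+1)^2}{2\sigma^2}}\right], \tag{35}$$

where $f_{\tilde{y}_i^d}(\eta)$ denotes the probability density function (PDF) of the continuous random variable $\tilde{y}_i^d$. Noting also that the deletions happen independently, the $\tilde{y}_i^d$'s are i.i.d. and we can write

$$h(\tilde{Y}|T=d) = (n-d)\,h(\tilde{y}_i^d) = (n-d)\int_{-\infty}^{\infty} -f_{\tilde{y}_i^d}(\eta)\log\left(f_{\tilde{y}_i^d}(\eta)\right)d\eta = (n-d)\left(\log\left(2\sigma\sqrt{2\pi e}\right) - E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right]\right), \tag{36}$$

where $z \sim \mathcal{N}(0, \sigma^2)$. By substituting Eqn. (36) into Eqn. (33), we obtain

$$h(\tilde{Y}|T) = \sum_{d=0}^{n} (n-d)\binom{n}{d} p_d^d (1-p_d)^{n-d}\left(\log\left(2\sigma\sqrt{2\pi e}\right) - E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right]\right) = n(1-p_d)\left(\log\left(2\sigma\sqrt{2\pi e}\right) - E\left[\log\left(1 + e^{-\frac{2(z+1)}{\sigma^2}}\right)\right]\right), \tag{37}$$

and by using Eqns. (37) and (32), Eqn. (31) is obtained. ■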
In the following proposition, we derive an upper bound on the differential entropy of the output conditioned on the input for the deletion-AWGN channel.

Proposition 4. For a deletion-AWGN channel with i.u.d. input bits, the differential entropy of the output sequence $\tilde{Y}$ conditioned on the input $X$ of length $n$ is upper bounded by

$$h(\tilde{Y}|X) \le nH_b(p_d) - \sum_{j=1}^{n} W_j(n)\binom{n}{j} p_d^j (1-p_d)^{n-j} + n(1-p_d)\log\left(\sigma\sqrt{2\pi e}\right), \tag{38}$$

where $W_j(n)$ is given in Eqn. (8).

Proof: For the conditional differential entropy of the output sequence given the length-$n$ input $X$, we can write

$$h(\tilde{Y}|X) = h(\tilde{Y}|X) + H(T|\tilde{Y}, X) = H(T) + h(\tilde{Y}|T, X), \tag{39}$$

where in the first equality we used the fact that, by knowing $X$ and $\tilde{Y}$, the number of deletions is known, i.e., $H(T|\tilde{Y}, X) = 0$. The second equality is obtained by using a different expansion of $h(\tilde{Y}, T|X)$ and also using the fact that the deletion process is independent of the input $X$, i.e., $H(T|X) = H(T)$. On the other hand, we also have

$$h(\tilde{Y}|T, X) = \sum_{d=0}^{n} h(\tilde{Y}|X, T=d)P(T=d) = \sum_{d=0}^{n} h(\tilde{Y}|X, T=d)\binom{n}{d} p_d^d (1-p_d)^{n-d}. \tag{40}$$

To obtain $h(\tilde{Y}|X, T=d)$, we need to compute $f_{\tilde{y}|x,d}(\eta)$ for any given input sequence $x = (b; n_1, n_2, \ldots, n_K)$ and different values of $d$. As in the proof of Proposition 2, if we consider the outputs of the deletion channel resulting from different deletion patterns of length $d$ from a given $x$ as if they are distinct, and also use the result in Eqn. (22), an upper bound on the differential output entropy conditioned on the input sequence $X$ results. We relegate the details of this computation and the completion of the proof of the proposition to Appendix B. ■

We can now state the proof of the main lemma of the section.

Proof of Lemma 2: By substituting the exact value of the differential output entropy in Eqn. (31) and the upper bound on the differential output entropy conditioned on the input in Eqn. (38) into Eqn. (6), a lower bound on the mutual information rate of a deletion-AWGN channel is obtained; hence the lemma is proved. □

IV. Lower Bounds on the Capacity of Random Insertion Channels 

We now turn our attention to the random insertion channels and derive lower bounds on their capacity by employing the approach proposed in Section II. We consider the Gallager model [3] for insertion channels, in which every transmitted bit is independently replaced by two random bits with probability $p_i$, while neither the receiver nor the transmitter has information about the positions of the insertions. The following lemma provides the main result of this section.
Lemma 3. For any $n > 0$, the capacity of the random insertion channel, $C_i$, is lower bounded by

$$C_i \ge (1-p_i)^n - H_b(p_i) + \left(S_3(n) - \frac{3n+1}{4n} + n\right)p_i(1-p_i)^{n-1} + \frac{1}{n}\left(1 - (1-p_i)^n - np_i(1-p_i)^{n-1} - p_i^n - np_i^{n-1}(1-p_i)\right)\log\left(\frac{n(n-1)}{2}\right) + p_i^{n-1}(1-p_i)\log(n), \tag{41}$$

where $S_3(n) = \frac{1}{4n}\sum_{l=1}^{n-1} 2^{-l}\left[(n+1-l)(l+2)\log(l+2) + 2(l+1)\log(l+1)\right] + \frac{\log(n)}{2^{n+1}}$. □

To the best of our knowledge, the only analytical lower bound on the capacity of the random insertion channel is derived in [3] (i.e., Eqn. (1) for $p_d = p_e = 0$). Our result improves upon this result for small values of the insertion probability, as will be apparent from the numerical examples.

Similar to the deletion-substitution channel case, we can write a simpler lower bound as

a >1 - H,{p,) + (^Ss{n) - - ^(253(n) - ^ + n - log Q 

n — l\ /n\ ^ / X 2n 3n + 1 , o fn — l\ f ^ , . 3n + 1 \ . ^ 



For instance, for $n = 10$, Eqn. (42) evaluates to



Ci>l- Hbipi) + 1.1591pi - 30.7184pf + 1.0502 x lO^p,^ - 1.3391 x 10^. (43) 
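The two leading numerical coefficients quoted for $n = 10$ (1.1591 for $p_i$ and $-30.7184$ for $p_i^2$) can be cross-checked against the bound of Lemma 3. The sketch below is a hedged consistency check, not the authors' derivation: it assumes $S_3(n) = \frac{1}{4n}\sum_{l=1}^{n-1} 2^{-l}[(n+1-l)(l+2)\log(l+2) + 2(l+1)\log(l+1)] + \log(n)/2^{n+1}$ with base-2 logarithms, expands the non-entropy part of Eqn. (41) to second order around $p_i = 0$ (terms in $p_i^{n-1}$ and $p_i^n$ are dropped, which is valid for $n \ge 4$), and uses hypothetical function names.

```python
import math


def s3(n: int) -> float:
    """S_3(n) in the form assumed above (logs base 2)."""
    total = sum(
        2.0 ** (-l) * ((n + 1 - l) * (l + 2) * math.log2(l + 2)
                       + 2 * (l + 1) * math.log2(l + 1))
        for l in range(1, n)
    )
    return total / (4 * n) + math.log2(n) / 2 ** (n + 1)


def leading_coefficients(n: int) -> tuple:
    """Coefficients of p and p^2 in Eqn. (41) minus (1 - H_b(p)), for n >= 4."""
    c = s3(n) - (3 * n + 1) / (4 * n) + n
    pairs = math.comb(n, 2)
    # (1-p)^n contributes -n p + C(n,2) p^2
    # c p (1-p)^(n-1) contributes c p - c (n-1) p^2
    # (1/n)(1 - (1-p)^n - n p (1-p)^(n-1)) log2(C(n,2)) contributes C(n,2) log2(C(n,2)) p^2 / n
    first = c - n
    second = pairs - (n - 1) * c + pairs * math.log2(pairs) / n
    return first, second


if __name__ == "__main__":
    print(leading_coefficients(10))
```

Only the first two coefficients are compared here; the higher-order terms of the simplified bound are not reproduced by this expansion.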

To prove the above lemma, we need the following two propositions. The output entropy of the random insertion channel with i.u.d. input sequences is calculated in the first one.

Proposition 5. For a random insertion channel with i.u.d. input sequences of length $n$, we have

$$H(Y) = n(1+p_i) + H(T), \tag{44}$$

where Y denotes the output sequence and H{T) is as defined in Eqn. Q. 
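Proof: Similar to the proof of Proposition 1, we use the fact that

$$P(y(n+j)) = \frac{1}{2^{n+j}}\binom{n}{j} p_i^j (1-p_i)^{n-j}. \tag{45}$$

Therefore, by employing Eqn. (45) in computing the output entropy, we obtain

$$H(Y) = -\sum_{j=0}^{n} \binom{n}{j} p_i^j (1-p_i)^{n-j} \log\left(\frac{1}{2^{n+j}}\binom{n}{j} p_i^j (1-p_i)^{n-j}\right) = n(1+p_i) + H(T). \tag{46}$$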
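The identity in Proposition 5 is easy to confirm numerically. The following sketch evaluates the sum over the number of insertions directly and compares it with $n(1+p_i) + H(T)$, where $H(T)$ is taken to be the entropy of a binomial$(n, p_i)$ random variable (the number of insertions). Function names are illustrative and entropies are in bits.

```python
import math


def binomial_pmf(n: int, p: float, j: int) -> float:
    """P(T = j) for T ~ Binomial(n, p)."""
    return math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)


def entropy_of_insertion_count(n: int, p: float) -> float:
    """H(T): entropy of the number of insertions, in bits."""
    return -sum(q * math.log2(q)
                for q in (binomial_pmf(n, p, j) for j in range(n + 1)) if q > 0.0)


def output_entropy_direct(n: int, p: float) -> float:
    """Direct evaluation of the sum over j: each of the 2**(n+j) outputs has probability P(T=j)/2**(n+j)."""
    total = 0.0
    for j in range(n + 1):
        q = binomial_pmf(n, p, j)
        if q > 0.0:
            total -= q * (math.log2(q) - (n + j))
    return total


if __name__ == "__main__":
    n, p = 20, 0.2
    print(output_entropy_direct(n, p), n * (1 + p) + entropy_of_insertion_count(n, p))
```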
In the following proposition, we present an upper bound on the conditional output entropy of the random insertion channel with i.u.d. input sequences for a given input of length $n$.

Proposition 6. For a random insertion channel with input and output sequences denoted by $X$ and $Y$, respectively, with i.u.d. input sequences of length $n$, we have

$$H(Y|X) \le n(1+p_i) - n(1-p_i)^n + nH_b(p_i) - n\left(S_3(n) - \frac{3n+1}{4n} + n\right)p_i(1-p_i)^{n-1} - \left(1 - (1-p_i)^n - np_i(1-p_i)^{n-1} - p_i^n - np_i^{n-1}(1-p_i)\right)\log\left(\frac{n(n-1)}{2}\right) - np_i^{n-1}(1-p_i)\log(n), \tag{47}$$

where $S_3(n)$ is given in Eqn. (41).



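Proof: For the conditional output sequence distribution for a given input sequence, we can write

$$p(y|x(b;n;K)) = \begin{cases}
(1-p_i)^n, & y = x(b;n;K) \\
\frac{n_1+1}{4}\,p_i(1-p_i)^{n-1}, & y = (b; n_1+1, \ldots, n_K) \\
\frac{n_K+1}{4}\,p_i(1-p_i)^{n-1}, & y = (b; n_1, \ldots, n_K+1) \\
\frac{n_k+2}{4}\,p_i(1-p_i)^{n-1}, & y = (b; n_1, \ldots, n_k+1, \ldots, n_K)\ (1 < k < K) \\
\frac{1}{4}\,p_i(1-p_i)^{n-1}, & y = (b; n_1, \ldots, n'_{k,1}, 2, n'_{k,2}, \ldots, n_K) \\
\frac{1}{2}\,p_i(1-p_i)^{n-1}, & y = (b; n_1, \ldots, n''_{k,1}, 1, n''_{k,2}, \ldots, n_K) \\
\frac{1}{4}\,p_i(1-p_i)^{n-1}, & y = (\bar{b}; 1, n_1, \ldots, n_k, \ldots, n_K) \\
\frac{1}{4}\,p_i(1-p_i)^{n-1}, & y = (b; n_1, \ldots, n_k, \ldots, n_K, 1) \\
\epsilon_{y,x}, & |y| \ge n+2
\end{cases} \tag{48}$$

where $n'_{k,1} + n'_{k,2} = n_k - 1$ ($n'_{k,1}, n'_{k,2} \ge 0$), $n''_{k,1} + n''_{k,2} = n_k$ ($n''_{k,1}, n''_{k,2} \ge 1$), and $\epsilon_{y,x}$ represents $p(y|x(b;n;K))$ for a given $y$ with $|y| \ge n+2$. Hence, we obtain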



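$$H(Y|x(b;n;K^x)) = -(1-p_i)^n\log(1-p_i)^n - p_i(1-p_i)^{n-1}\left(n\log\left(p_i(1-p_i)^{n-1}\right) - 1.5n - 0.5K^x\right) - \frac{1}{4}p_i(1-p_i)^{n-1}\left[(n_1^x+1)\log(n_1^x+1) + (n_{K^x}^x+1)\log(n_{K^x}^x+1) + \sum_{k=2}^{K^x-1}(n_k^x+2)\log(n_k^x+2)\right] + H_\epsilon(x), \tag{49}$$

where $H_\epsilon(x)$ is the term related to the outputs resulting from more than one insertion. Therefore, by considering i.u.d. input sequences, we have

$$H(Y|X) = -(1-p_i)^n\log(1-p_i)^n - np_i(1-p_i)^{n-1}\left(\log\left(p_i(1-p_i)^{n-1}\right) - \frac{7n+1}{4n} + S_3(n)\right) + H_\epsilon(X), \tag{50}$$

where $H_\epsilon(X) = \sum_{x\in\mathcal{X}} \frac{1}{2^n} H_\epsilon(x)$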



and 



which can be written as 



(nf + 1) log« + 1) + {nl. + 1) log(n^. + i) + J] + 2) log(n^' + 2) 



k=2 



+ 



log(n) 

2n+l ' 



2"+2 



n 



J]J]K + 2)log(n^ + 2) + 2 [K + l)logK + l)-K + 2)logK + 2)] 



X k=l 



+ ^ = ^ E 2-' [(n + 1 - 0(/ + 2) log(Z + 2) + 2{l + 1) log(/ + 1)] + (51) 
1=1 

Here we have used the same approach used in the proof of Proposition [2j and considered the fact that there are 
2"^' sequences of length n with ni = / or uk = I. 

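If we assume that all the possible outputs resulting from $i$ insertions ($i \ge 2$) for a given $x$ are equiprobable, since

$$-\sum_{j=1}^{N} p_j\log(p_j) \le -\left[\sum_{j=1}^{N} p_j\right]\log\left(\frac{\sum_{j=1}^{N} p_j}{N}\right), \tag{52}$$

we can upper bound $H_\epsilon(x)$. That is,

$$H_\epsilon(x) = -\sum_{i=2}^{n}\sum_{y\in\mathcal{Y}(x,i)} \epsilon_{y,x}\log(\epsilon_{y,x}) \le -\sum_{i=2}^{n} \epsilon_i\log\left(\frac{\epsilon_i}{|\mathcal{Y}(x,i)|}\right) \le -\sum_{i=2}^{n} \epsilon_i\log\left(\frac{\epsilon_i}{2^{n+i}}\right), \tag{53}$$

where $\epsilon_i = \sum_{y\in\mathcal{Y}(x,i)} p(y|x) = \binom{n}{i}p_i^{\,i}(1-p_i)^{n-i}$ is the probability of $i$ insertions for transmission of $n$ bits, and the last inequality results from the fact that $|\mathcal{Y}(x,j)| \le 2^{n+j}$, where $|\mathcal{Y}(x,j)|$ denotes the number of output sequences resulting from $j$ insertions into a given input sequence $x$. After some algebra, we arrive at

$$H_\epsilon(X) \le n(1+p_i) + nH_b(p_i) - n(1-p_i)^n - (n+1)np_i(1-p_i)^{n-1} + (1-p_i)^n\log(1-p_i)^n + np_i(1-p_i)^{n-1}\log\left(p_i(1-p_i)^{n-1}\right) - np_i^{n-1}(1-p_i)\log(n) - \left(1 - p_i^n - (1-p_i)^n - np_i(1-p_i)^{n-1} - np_i^{n-1}(1-p_i)\right)\log\left(\frac{n(n-1)}{2}\right). \tag{54}$$

Finally, by substituting Eqn. (54) into Eqn. (50), the upper bound (47) is obtained. ■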
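The "some algebra" step can be spot-checked numerically: the rightmost expression of Eqn. (53), $-\sum_{i\ge 2}\epsilon_i\log(\epsilon_i/2^{n+i})$, should never exceed the closed-form bound of Eqn. (54). The sketch below assumes base-2 logarithms and $n \ge 3$; the function names are hypothetical. For $n = 3$ and $n = 4$ the two expressions coincide, because every binomial coefficient $\binom{n}{i}$ with $2 \le i \le n-2$ then equals $\binom{n}{2}$.

```python
import math


def eps(n: int, p: float, i: int) -> float:
    """Probability of exactly i insertions among n transmitted bits."""
    return math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)


def bound_53(n: int, p: float) -> float:
    """-sum_{i>=2} eps_i * log2(eps_i / 2**(n+i))."""
    return sum(e * ((n + i) - math.log2(e))
               for i in range(2, n + 1)
               for e in (eps(n, p, i),) if e > 0.0)


def bound_54(n: int, p: float) -> float:
    """Closed-form upper bound of Eqn. (54), for 0 < p < 1 and n >= 3."""
    q = 1.0 - p
    a = p * q ** (n - 1)          # probability of one insertion at a fixed position
    hb = -p * math.log2(p) - q * math.log2(q)
    rest = 1.0 - p ** n - q ** n - n * a - n * p ** (n - 1) * q
    return (n * (1 + p) + n * hb - n * q ** n - (n + 1) * n * a
            + q ** n * math.log2(q ** n) + n * a * math.log2(a)
            - n * p ** (n - 1) * q * math.log2(n)
            - rest * math.log2(n * (n - 1) / 2))


if __name__ == "__main__":
    for n in (4, 10, 25):
        print(n, bound_53(n, 0.05), bound_54(n, 0.05))
```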



Proof of Lemma 3: By substituting the exact value of the output entropy (Eqn. (44)) and the upper bound on the conditional output entropy (Eqn. (47)) of the random insertion channel with i.u.d. input sequences into the definition of the mutual information, a lower bound on the achievable information rate is obtained; hence the lemma is proved. □



V. Numerical Examples 

We now present several examples of the lower bounds on the insertion/deletion channel capacity for different values of $n$ and compare them with the existing ones in the literature.



A. Deletion-Substitution Channel 

In Table I, we compare the lower bound (7) for $n = 100$ and $n = 1000$ with the one in [3]. We observe that the bound improves the result of [3] for the entire range of $p_d$ and $p_e$, and also, as expected, by increasing $n$ from 100 to 1000, a tighter lower bound is obtained for all values of $p_d$ and $p_e$.

TABLE I
Lower bounds on the capacity of the deletion-substitution channel (in the first table, "1 - lower bound" is reported)

p_d      p_e      1-LB from [3]      1-LB (n = 1000)    1-LB (n = 100)
10^-5    10^-5    3.6104 × 10^-4     3.5817 × 10^-4     3.5834 × 10^-4
10^-5    10^-4    1.6535 × 10^-3     1.6506 × 10^-3     1.6508 × 10^-3
10^-5    10^-3    1.15881 × 10^-2    1.15853 × 10^-2    1.15854 × 10^-2
10^-4    10^-5    1.6535 × 10^-3     1.6248 × 10^-3     1.6264 × 10^-3
10^-4    10^-4    2.9459 × 10^-3     2.9172 × 10^-3     2.9188 × 10^-3
10^-4    10^-3    1.2879 × 10^-2     1.2850 × 10^-2     1.2852 × 10^-2
10^-3    10^-5    1.1588 × 10^-2     1.1302 × 10^-2     1.1319 × 10^-2
10^-3    10^-4    1.2879 × 10^-2     1.2593 × 10^-2     1.261 × 10^-2
10^-3    10^-3    2.2804 × 10^-2     2.2518 × 10^-2     2.2535 × 10^-2

p_d      p_e      LB from [3]    LB (n = 1000)    LB (n = 100)
0.01     0.01     0.8392         0.8419           0.8418
0.01     0.03     0.7268         0.7373           0.7293
0.01     0.1      0.4549         0.4576           0.4575
0.05     0.01     0.6368         0.6476           0.6469
0.05     0.03     0.5289         0.5397           0.5390
0.05     0.1      0.2681         0.2789           0.2781
0.1      0.01     0.4583         0.4729           0.4716
0.1      0.03     0.3561         0.3707           0.3693
0.1      0.1      0.1089         0.1236           0.1222


B. Deletion-AWGN Channel 

We now compare the derived analytical lower bound on the capacity of the deletion-AWGN channel with the simulation-based bound of [11], which is the achievable information rate of the deletion-AWGN channel for i.u.d. input sequences obtained by Monte-Carlo simulations. As we observe in Fig. 3, the lower bound (30) is very close to the simulation results of [11] for small values of the deletion probability, but it does not improve on them. This is not unexpected, because we further lower bounded the achievable information rate for i.u.d. input sequences, while in [11] the achievable information rate for i.u.d. input sequences is obtained by Monte-Carlo simulations without any further lower bounding. On the other hand, the new bound is provable, analytical and very easy to compute, while the result in [11] requires lengthy simulations. Furthermore, the procedure employed in [11] is only useful for deriving lower bounds for small values of the deletion probability, e.g., $p_d < 0.1$, while the lower bound (30) holds for a much wider range.
Fig. 3. Comparison between the lower bound (30) for n = 1000 with the lower bound in [11] versus SNR for different deletion probabilities. (Horizontal axis: SNR (dB); curves shown for p_d = 0.01, 0.02 and 0.03.)




C. Random Insertion Channel 

We now numerically evaluate the lower bounds derived on the capacity of the random insertion channel. Similar to the previous cases, different values of $n$ result in different lower bounds. In Table II and Fig. 4, we compare the lower bound in Eqn. (41) with the Gallager lower bound ($1 - H_b(p_i)$), where the reported values are obtained for the optimal value of $n$. We observe that for larger $p_i$, smaller values of $n$ give the tightest lower bounds. This is not unexpected since, in upper bounding $H(Y|X)$, we computed the exact value of $p(y|x)$ for at most one insertion, i.e., $|y| = |x|$ or $|y| = |x| + 1$, and upper bounded the part of the conditional entropy resulting from more than one insertion. Therefore, for a fixed $p_i$, by increasing $n$, the probability of having more than one insertion increases and, as a result, the upper bound becomes loose. We also observe that the lower bound (41) improves upon Gallager's lower bound [3] for $p_i < 0.25$; e.g., for $p_i = 0.1$, we achieve an improvement of 0.0392 bits/channel use, which is significant.

TABLE II
Lower bounds on the capacity of the random insertion channel (in the first table, "1 - lower bound" is reported)

p_i      1-LB from [3]    1-LB (41)        optimal value of n
10^-6    2.14 × 10^-5     2.007 × 10^-5    121
10^-5    1.81 × 10^-4     1.68 × 10^-4     57
10^-4    1.47 × 10^-3     1.35 × 10^-3     27
10^-3    1.14 × 10^-2     1.02 × 10^-2     13
10^-2    8.07 × 10^-2     7.14 × 10^-2     7

p_i      LB from [3]    LB (41)    optimal value of n
0.03     0.8056         0.8276     5
0.05     0.7136         0.7442     5
0.10     0.5310         0.5702     4
0.15     0.3901         0.4230     4
0.20     0.2781         0.2962     3
0.23     0.2220         0.2283     3
0.25     0.1887         0.1853     3

Fig. 4. Comparison of the lower bound (41) with the lower bound presented in [3]. (Horizontal axis: p_i, from 0.05 to 0.25.)
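The Gallager column of Table II is simply $1 - H_b(p_i)$, so it can be regenerated in a few lines and compared with the tabulated values of the new bound. The sketch below hard-codes the second half of Table II as transcribed above; the dictionary and function names are illustrative only.

```python
import math


def binary_entropy(p: float) -> float:
    """H_b(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def gallager_lower_bound(p_i: float) -> float:
    """Gallager's lower bound 1 - H_b(p_i) for the random insertion channel."""
    return 1.0 - binary_entropy(p_i)


# p_i -> (LB from [3], LB (41)), second half of Table II
TABLE_II = {
    0.03: (0.8056, 0.8276),
    0.05: (0.7136, 0.7442),
    0.10: (0.5310, 0.5702),
    0.15: (0.3901, 0.4230),
    0.20: (0.2781, 0.2962),
    0.23: (0.2220, 0.2283),
    0.25: (0.1887, 0.1853),
}

if __name__ == "__main__":
    for p_i, (gallager, new_bound) in TABLE_II.items():
        print(p_i, round(gallager_lower_bound(p_i), 4), gallager,
              round(new_bound - gallager, 4))
```

The last printed column is the gain of the bound (41) over Gallager's bound; it turns negative at $p_i = 0.25$, in line with the improvement being reported only for $p_i < 0.25$.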

VI. Conclusion 

We have presented several analytical lower bounds on the capacity of the insertion/deletion channels by lower bounding the mutual information rate for i.u.d. input sequences. We have derived the first analytical lower bound on the capacity of the deletion-AWGN channel, which for small values of the deletion probability is very close to the existing simulation-based lower bounds. The lower bound presented on the capacity of the deletion-substitution channel improves the existing analytical lower bound for all values of the deletion and substitution probabilities. For the random insertion channel, the presented lower bound improves the existing ones for small values of the insertion probability. For $p_e = 0$, the presented lower bound on the capacity of the deletion-substitution channel results in a lower bound on the capacity of the deletion channel which, for small values of the deletion probability, is very close to the tightest presented lower bounds and is in agreement with the first-order expansion of the channel capacity for $p_d \to 0$, while our result is a strict lower bound over the entire range of $p_d$.
Appendix A
Part of Proof of Proposition 2



H Y' 



= Piy' (n -d)\x) log {P{y'{n-d)\x)) 

n 

^ P{y'\D*x)P{D\x)\og{P{y'\D*x)P{D\x)) , (55) 



where the inequaUty is obtained from the expression in p3] ). Furthermore, by employing the results from Eqns. ( |T9| ) 
and ([20]) and using the fact that there are {"'~'^) , distinct output sequences of length n—d resulting from e substitution 
errors into a given input x, i.e., e = dn {y'{n — d); D{n] K; d) * x{n] K)), we arrive at 



h( Y' 



n n—j 

x{b;n;Kn]<-Y,Y 

j=0 e=0 

X log 



n-j 
e 



jlH hjjf=i 

n 



Jl 



K 
JK 



n-j 



E E 



JkJ 



K 
JK 



+ log 



Jl 



K \^ 
JK 



piii-PdT-' 

p'di^-p^r-' 



(n - j)Hb{pe) 



n 



Y 

j=Oji+-+jK 



''^'']p^d(^-Pdr-' 

jK ' 



n{l -pd)Hb{pe) 



+ log 



Jl 



n 



K 
JK 



(56) 



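Using the generalized Vandermonde's identity, that is, $\sum_{j_1+\cdots+j_{K^x}=j} \binom{n_1^x}{j_1}\cdots\binom{n_{K^x}^x}{j_{K^x}} = \binom{n}{j}$, and the result

$$\sum_{j_1+\cdots+j_{K^x}=j} \binom{n_1^x}{j_1}\cdots\binom{n_{K^x}^x}{j_{K^x}} \log\left[\binom{n_1^x}{j_1}\cdots\binom{n_{K^x}^x}{j_{K^x}}\right] = \sum_{j_1+\cdots+j_{K^x}=j} \binom{n_1^x}{j_1}\cdots\binom{n_{K^x}^x}{j_{K^x}} \sum_{k=1}^{K^x} \log\binom{n_k^x}{j_k} = \sum_{k=1}^{K^x}\sum_{j_k=0}^{j} \binom{n_k^x}{j_k}\binom{n-n_k^x}{j-j_k}\log\binom{n_k^x}{j_k},$$

we obtain

$$H\big(Y' \mid x(b;n;K^x)\big) \le nH_b(p_d) + n(1-p_d)H_b(p_e) - \sum_{j=0}^{n} p_d^j(1-p_d)^{n-j} \sum_{k=1}^{K^x}\sum_{j_k=0}^{j} \binom{n_k^x}{j_k}\binom{n-n_k^x}{j-j_k}\log\binom{n_k^x}{j_k}.$$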
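Both combinatorial facts used in this appendix, the generalized Vandermonde identity and the reduction of the sum over deletion patterns $(j_1, \ldots, j_K)$ to a sum over single runs, can be verified by brute force for a small run-length profile. The sketch below is illustrative: the profile `(3, 1, 2, 4)` and all function names are arbitrary choices, and logarithms are base 2.

```python
import math
from itertools import product


def patterns(runs, j):
    """Yield all (j_1, ..., j_K) with 0 <= j_k <= n_k and j_1 + ... + j_K = j."""
    for js in product(*(range(nk + 1) for nk in runs)):
        if sum(js) == j:
            yield js


def weight(runs, js):
    """Number of deletion patterns that remove j_k bits from run k, for every k."""
    return math.prod(math.comb(nk, jk) for nk, jk in zip(runs, js))


def vandermonde_count(runs, j):
    """Left-hand side of the generalized Vandermonde identity."""
    return sum(weight(runs, js) for js in patterns(runs, j))


def grouped_log_sum(runs, j):
    """Sum over patterns of weight * log2(weight)."""
    return sum(w * math.log2(w) for w in (weight(runs, js) for js in patterns(runs, j)))


def per_run_log_sum(runs, j):
    """Sum over runs k and j_k of C(n_k, j_k) C(n - n_k, j - j_k) log2 C(n_k, j_k)."""
    n = sum(runs)
    return sum(math.comb(nk, jk) * math.comb(n - nk, j - jk) * math.log2(math.comb(nk, jk))
               for nk in runs for jk in range(min(nk, j) + 1))


if __name__ == "__main__":
    runs = (3, 1, 2, 4)
    for j in range(sum(runs) + 1):
        print(j, vandermonde_count(runs, j), grouped_log_sum(runs, j), per_run_log_sum(runs, j))
```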
Appendix B
Proof of Proposition 4

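For an i.i.d. deletion-AWGN channel, for a given $x(b;n;K)$ and a fixed $d$, we have

$$\begin{aligned}
f_{\tilde{y}}(\eta|x(b;n;K),d) &= \sum_{D\in\mathcal{D}_n(d)} f_{\tilde{y}}(\eta|x(b;n;K),D)\,P(D|x(b;n;K)) \\
&= \sum_{D\in\mathcal{D}_n(d)} f_{\tilde{y}}(\eta|\alpha(D,x))\,P(D|x(b;n;K)) \\
&= \sum_{D\in\mathcal{D}_n(d)} f_{\tilde{y}_1\cdots\tilde{y}_{n-d}}(\eta_1,\ldots,\eta_{n-d}|\alpha_1,\ldots,\alpha_{n-d})\,P(D|x(b;n;K)) \\
&= \sum_{D\in\mathcal{D}_n(d)} f_{\tilde{y}_1}(\eta_1|\alpha_1)\cdots f_{\tilde{y}_{n-d}}(\eta_{n-d}|\alpha_{n-d})\,P(D|x(b;n;K)),
\end{aligned} \tag{58}$$

where $\alpha(D,x) = 1 - 2(D * x)$, i.e., $\alpha_i(D,x) \in \{1,-1\}$, and the last equality follows from the fact that the noise samples $z_i$ are independent and the $\alpha_i(D,x)$'s are also independent. By employing

$$f_{\tilde{y}_i}(\eta_i|\alpha_i(D,x)) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(\eta_i - \alpha_i(D,x))^2}{2\sigma^2}\right)$$

and $P\big(D(n;K;d) \mid x(b;n;K), d\big) = \frac{1}{\binom{n}{d}}$, we can write

$$f_{\tilde{y}}(\eta|x(b;n;K),d) = \frac{1}{(\sqrt{2\pi}\sigma)^{n-d}} \sum_{D\in\mathcal{D}_n(d)} \frac{1}{\binom{n}{d}} \prod_{i=1}^{n-d}\exp\left(-\frac{(\eta_i - \alpha_i(D,x))^2}{2\sigma^2}\right) = \frac{1}{(\sqrt{2\pi}\sigma)^{n-d}} \sum_{d_1+\cdots+d_K=d} \frac{\binom{n_1}{d_1}\cdots\binom{n_K}{d_K}}{\binom{n}{d}} \prod_{i=1}^{n-d}\exp\left(-\frac{(\eta_i - \alpha_i(D,x))^2}{2\sigma^2}\right). \tag{59}$$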



Therefore, we obtain

$$\begin{aligned}
h(\tilde{Y}|x,d) &= -\int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} f_{\tilde{y}}(\eta|x,d)\,\log\big(f_{\tilde{y}}(\eta|x,d)\big)\,d\eta_1\ldots d\eta_{n-d} \\
&= (n-d)\log(\sqrt{2\pi}\sigma) + \log\binom{n}{d} - \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} \frac{1}{(\sqrt{2\pi}\sigma)^{n-d}} \sum_{d_1+\cdots+d_K=d} \frac{\binom{n_1}{d_1}\cdots\binom{n_K}{d_K}}{\binom{n}{d}} \prod_{i=1}^{n-d}\exp\left(-\frac{(\eta_i - \alpha_i(D,x))^2}{2\sigma^2}\right) \\
&\qquad \times \log\left(\sum_{d'_1+\cdots+d'_K=d} \binom{n_1}{d'_1}\cdots\binom{n_K}{d'_K} \prod_{i=1}^{n-d}\exp\left(-\frac{(\eta_i - \alpha_i(D',x))^2}{2\sigma^2}\right)\right) d\eta_1\ldots d\eta_{n-d},
\end{aligned} \tag{60}$$

where we used the result of the generalized Vandermonde's identity and also the fact that $\int_{-\infty}^{\infty} f_{\tilde{y}_i}(\eta_i|\bar{y}_i)\,d\eta_i = 1$.
By using the inequality

$$\sum_{d'_1+\cdots+d'_K=d} \binom{n_1}{d'_1}\cdots\binom{n_K}{d'_K} \prod_{i=1}^{n-d}\exp\left(-\frac{(\eta_i - \alpha_i(D',x))^2}{2\sigma^2}\right) \ge \binom{n_1}{d_1}\cdots\binom{n_K}{d_K} \prod_{i=1}^{n-d}\exp\left(-\frac{(\eta_i - \alpha_i(D,x))^2}{2\sigma^2}\right),$$

which holds for every $d_1 + \cdots + d_K = d$, we can write

$$h(\tilde{Y}|x,d) \le (n-d)\log(\sqrt{2\pi e}\,\sigma) + \log\binom{n}{d} - \sum_{d_1+\cdots+d_K=d} \frac{\binom{n_1}{d_1}\cdots\binom{n_K}{d_K}}{\binom{n}{d}} \log\left[\binom{n_1}{d_1}\cdots\binom{n_K}{d_K}\right]. \tag{61}$$

By considering i.u.d. input sequences, we have

$$h(\tilde{Y}|X,T) = \sum_{d=0}^{n} \binom{n}{d} p_d^d(1-p_d)^{n-d} \sum_{x\in\mathcal{X}} \frac{1}{2^n}\, h(\tilde{Y}|x,d) \le n(1-p_d)\log(\sqrt{2\pi e}\,\sigma) + \sum_{d=0}^{n} \binom{n}{d} p_d^d(1-p_d)^{n-d}\left[\log\binom{n}{d} - W_d(n)\right], \tag{62}$$

where $W_d(n)$ is given in Eqn. (8), and the result is obtained by following the same steps as in the computation leading to (25). Therefore, by substituting Eqn. (62) into Eqn. (39), Eqn. (38) is obtained, which concludes the proof.